Text File | 1995-06-28 | 7KB | 240 lines
Incompatibilities
Previous: <C++=>C> * Next: <Diagnostics=>Diagnostic> * Up: <Top=>!Root>
#Wrap on
{fH3}Incompatibilities with {fCode}lex{f} and POSIX{f}
{fCode}flex{f} is a rewrite of the AT&T Unix {fCode}lex{f} tool (the two
implementations do not share any code, though), with some
extensions and incompatibilities, both of which are of
concern to those who wish to write scanners acceptable to
either implementation. {fCode}flex{f} is fully compliant with the
POSIX {fCode}lex{f} specification, with one exception: when using {fEmphasis}%pointer{f}
(the default), a call to {fEmphasis}unput(){f} destroys the contents of
{fCode}yytext{f}, which is counter to the POSIX specification.
In this section we discuss all of the known areas of
incompatibility between {fCode}flex{f}, AT&T {fCode}lex{f}, and the POSIX
specification.
{fCode}flex's{f} {fEmphasis}-l{f} option turns on maximum compatibility with the
original AT&T {fCode}lex{f} implementation, at the cost of a major
loss in the generated scanner's performance. We note
below which incompatibilities can be overcome using the {fEmphasis}-l{f}
option.
{fCode}flex{f} is fully compatible with {fCode}lex{f} with the following
exceptions:
#Indent +4
- The undocumented {fCode}lex{f} scanner internal variable {fCode}yylineno{f}
is not supported unless {fEmphasis}-l{f} or {fEmphasis}%option yylineno{f} is used.
{fCode}yylineno{f} should be maintained on a per-buffer basis, rather
than a per-scanner (single global variable) basis. {fCode}yylineno{f} is
not part of the POSIX specification.
- The {fEmphasis}input(){f} routine is not redefinable, though it
may be called to read characters following whatever
has been matched by a rule. If {fEmphasis}input(){f} encounters
an end-of-file the normal {fEmphasis}yywrap(){f} processing is
done. A "real" end-of-file is returned by
{fEmphasis}input(){f} as {fCode}EOF{f}.
Rather than redefining {fEmphasis}input(){f}, control the scanner's input by
defining the {fCode}YY\_INPUT{f} macro.
The {fCode}flex{f} restriction that {fEmphasis}input(){f} cannot be
redefined is in accordance with the POSIX
specification, which simply does not specify any way of
controlling the scanner's input other than by making
an initial assignment to {fCode}yyin{f}.
- The {fEmphasis}unput(){f} routine is not redefinable. This
restriction is in accordance with POSIX.
- {fCode}flex{f} scanners are not as reentrant as {fCode}lex{f} scanners.
In particular, if you have an interactive scanner
and an interrupt handler which long-jumps out of
the scanner, and the scanner is subsequently called
again, you may get the following message:
#Wrap off
#fCode
fatal flex scanner internal error--end of buffer missed
#f
#Wrap on
To reenter the scanner, first use
#Wrap off
#fCode
yyrestart( yyin );
#f
#Wrap on
Note that this call will throw away any buffered
input; usually this isn't a problem with an
interactive scanner.
Also note that flex C++ scanner classes {fEmphasis}are{f}
reentrant, so if using C++ is an option for you, you
should use them instead. See "Generating C++
Scanners" above for details.
- {fEmphasis}output(){f} is not supported. Output from the {fEmphasis}ECHO{f}
macro is done to the file-pointer {fCode}yyout{f} (default
{fCode}stdout{f}).
{fEmphasis}output(){f} is not part of the POSIX specification.
- {fCode}lex{f} does not support exclusive start conditions
(%x), though they are in the POSIX specification.
- When definitions are expanded, {fCode}flex{f} encloses them
in parentheses. With lex, the following:
#Wrap off
#fCode
NAME [A-Z][A-Z0-9]\*
%%
foo\{NAME\}? printf( "Found it\\n" );
%%
#f
#Wrap on
will not match the string "foo" because when the
macro is expanded the rule is equivalent to
"foo[A-Z][A-Z0-9]\*?" and the precedence is such that the
'?' is associated with "[A-Z0-9]\*". With {fCode}flex{f}, the
rule will be expanded to "foo([A-Z][A-Z0-9]\*)?" and
so the string "foo" will match.
Note that if the definition begins with {fEmphasis}^{f} or ends
with {fEmphasis}${f} then it is {fEmphasis}not{f} expanded with parentheses, to
allow these operators to appear in definitions
without losing their special meanings. But the
{fEmphasis}<s>{f}, {fEmphasis}\/{f}, and {fEmphasis}<<EOF>>{f} operators cannot be used in a
{fCode}flex{f} definition.
Using {fEmphasis}-l{f} results in the {fCode}lex{f} behavior of no
parentheses around the definition.
The POSIX specification requires that the definition be enclosed in
parentheses.
- Some implementations of {fCode}lex{f} allow a rule's action to begin on
a separate line, if the rule's pattern has trailing whitespace:
#Wrap off
#fCode
%%
foo|bar<space here>
\{ foobar\_action(); \}
#f
#Wrap on
{fCode}flex{f} does not support this feature.
- The {fCode}lex{f} {fEmphasis}%r{f} (generate a Ratfor scanner) option is
not supported. It is not part of the POSIX
specification.
- After a call to {fEmphasis}unput(){f}, {fCode}yytext{f} is undefined until
the next token is matched, unless the scanner was
built using {fEmphasis}%array{f}. This is not the case with {fCode}lex{f}
or the POSIX specification. The {fEmphasis}-l{f} option does
away with this incompatibility.
- The precedence of the {fEmphasis}\{\}{f} (numeric range) operator
is different. {fCode}lex{f} interprets "abc\{1,3\}" as "match
one, two, or three occurrences of 'abc'", whereas
{fCode}flex{f} interprets it as "match 'ab' followed by one,
two, or three occurrences of 'c'". The latter is
in agreement with the POSIX specification.
- The precedence of the {fEmphasis}^{f} operator is different. {fCode}lex{f}
interprets "^foo|bar" as "match either 'foo' at the
beginning of a line, or 'bar' anywhere", whereas
{fCode}flex{f} interprets it as "match either 'foo' or 'bar'
if they come at the beginning of a line". The
latter is in agreement with the POSIX specification.
- The special table-size declarations such as {fEmphasis}%a{f}
supported by {fCode}lex{f} are not required by {fCode}flex{f} scanners;
{fCode}flex{f} ignores them.
- The name {fCode}FLEX\_SCANNER{f} is \#define'd so scanners may
be written for use with either {fCode}flex{f} or {fCode}lex{f}.
Scanners also include {fCode}YY\_FLEX\_MAJOR\_VERSION{f} and
{fCode}YY\_FLEX\_MINOR\_VERSION{f} indicating which version of
{fCode}flex{f} generated the scanner (for example, for the
2.5 release, these defines would be 2 and 5
respectively).
#Indent
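The parenthesization rule for definitions described in the list above can be illustrated with a small text-substitution sketch in plain C. It only models the visible effect of expanding a definition and is not how {fCode}flex{f} is implemented; the function name expand\_definition and all of its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustration only: mimics the textual effect of expanding one {name}
 * reference inside a rule pattern. This is NOT flex's real implementation.
 *
 * pattern       rule pattern containing a "{name}" reference
 * name, body    the definition, for example NAME and [A-Z][A-Z0-9]*
 * parenthesize  nonzero models flex/POSIX, zero models lex (or flex -l)
 *
 * Returns 0 on success, -1 if the reference is absent or out is too small.
 */
static int expand_definition(const char *pattern, const char *name,
                             const char *body, int parenthesize,
                             char *out, size_t out_size)
{
    char ref[64];
    const char *hit;
    size_t body_len = strlen(body);
    int n;

    /* Build the "{name}" text to look for in the pattern. */
    n = snprintf(ref, sizeof ref, "{%s}", name);
    if (n < 0 || (size_t)n >= sizeof ref)
        return -1;

    hit = strstr(pattern, ref);
    if (hit == NULL)
        return -1;

    /* Definitions beginning with ^ or ending with $ are left bare,
       so the anchors keep their special meaning. */
    if (body_len > 0 && (body[0] == '^' || body[body_len - 1] == '$'))
        parenthesize = 0;

    /* Text before the reference, the (possibly wrapped) body, the rest. */
    n = snprintf(out, out_size, "%.*s%s%s%s%s",
                 (int)(hit - pattern), pattern,
                 parenthesize ? "(" : "",
                 body,
                 parenthesize ? ")" : "",
                 hit + strlen(ref));
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called with the rule pattern and definition from the example in the list, the sketch produces the parenthesized form when parenthesize is nonzero and the bare {fCode}lex{f}-style form otherwise, which is why the trailing '?' applies to the whole definition only in the first case.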
The following {fCode}flex{f} features are not included in {fCode}lex{f} or the
POSIX specification:
#Wrap off
#fCode
C++ scanners
%option
start condition scopes
start condition stacks
interactive\/non-interactive scanners
yy\_scan\_string() and friends
yyterminate()
yy\_set\_interactive()
yy\_set\_bol()
YY\_AT\_BOL()
<<EOF>>
<\*>
YY\_DECL
YY\_START
YY\_USER\_ACTION
YY\_USER\_INIT
\#line directives
%\{\}'s around actions
multiple actions on a line
#f
#Wrap on
plus almost all of the {fCode}flex{f} flags. The last feature in
the list refers to the fact that with {fCode}flex{f} you can put
multiple actions on the same line, separated with
semicolons, while with {fCode}lex{f}, the following
#Wrap off
#fCode
foo handle\_foo(); ++num\_foos\_seen;
#f
#Wrap on
is (rather surprisingly) truncated to
#Wrap off
#fCode
foo handle\_foo();
#f
#Wrap on
{fCode}flex{f} does not truncate the action. Actions that are not
enclosed in braces are simply terminated at the end of the
line.
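Because {fCode}FLEX\_SCANNER{f} is defined only in scanners generated by {fCode}flex{f}, code that depends on the flex-only features listed above can be guarded with the preprocessor. The following is a minimal sketch in plain C; the helper names generator\_name and generator\_version are hypothetical, and only the three macro names come from this section. Compiled on its own, outside a generated scanner, it takes the non-flex branches.

```c
/*
 * Sketch: detect at compile time whether this code is being built as
 * part of a flex-generated scanner. FLEX_SCANNER, YY_FLEX_MAJOR_VERSION
 * and YY_FLEX_MINOR_VERSION are the names given in the text above; the
 * two helper functions are hypothetical and exist only for illustration.
 */

/* Returns a label for the tool that generated the enclosing scanner. */
static const char *generator_name(void)
{
#ifdef FLEX_SCANNER
    return "flex";
#else
    return "not flex";   /* lex, or a plain C file such as this sketch */
#endif
}

/*
 * Stores the flex version in *major and *minor and returns 1 when the
 * version macros are available; otherwise stores zeros and returns 0.
 */
static int generator_version(int *major, int *minor)
{
#if defined(FLEX_SCANNER) && defined(YY_FLEX_MAJOR_VERSION) \
    && defined(YY_FLEX_MINOR_VERSION)
    *major = YY_FLEX_MAJOR_VERSION;
    *minor = YY_FLEX_MINOR_VERSION;
    return 1;
#else
    *major = 0;
    *minor = 0;
    return 0;
#endif
}
```

The same \#ifdef pattern can wrap any use of a flex-only facility in a scanner's user code, with an alternative for {fCode}lex{f} in the \#else branch.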